Convert HTML entities from XML in JavaScript
April 15, 2009 at 11:29 pm Taco Fleur 1 comment
Sometimes you get passed a string from XML and it can contain HTML entities like the following:
&amp;
&lt;
&gt;
&quot;
&copy;
&reg;
&laquo;
&raquo;
&apos;
If you write the string with JavaScript then you get something like didn&apos;t instead of didn’t.
I’ve searched around for a function that could handle this but could not find one, so I wrote my own. I thought I’d share it with the world!
The following will convert &amp; to &, convert &apos; to ' etc.
The code follows. If you need help implementing it, I’m more than happy to explain how to install and run this function in return for a text link on your website. Now that’s cheap as, considering I normally charge $140 an hour 😉
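// Only define trim() where the browser does not already provide one.
if ( !String.prototype.trim ) {
    String.prototype.trim = function () {
        return this.replace( /^\s+|\s+$/g, "" );
    };
}

// Convert a handful of common named entities back to their characters.
String.prototype.convertHTMLEntity = function () {
    var myString = this;
    myString = myString.replace( /&lt;/g, '<' );
    myString = myString.replace( /&gt;/g, '>' );
    myString = myString.replace( /&quot;/g, '"' );
    myString = myString.replace( /&copy;/g, '©' );
    myString = myString.replace( /&reg;/g, '®' );
    myString = myString.replace( /&laquo;/g, '«' );
    myString = myString.replace( /&raquo;/g, '»' );
    myString = myString.replace( /&apos;/g, "'" );
    // Decode &amp; last so that "&amp;lt;" becomes "&lt;" rather than "<".
    myString = myString.replace( /&amp;/g, '&' );
    return myString;
};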
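As a quick usage sketch, here is the same idea written as a standalone function rather than a String.prototype extension. The name decodeBasicEntities is made up for this example and it only covers a handful of entities. It decodes &amp;amp; last, so a doubly escaped string such as &amp;amp;lt; comes out as the literal text &amp;lt; and not as <.

```javascript
// Hypothetical helper, not part of the original post: decodes a few
// named entities without touching String.prototype.
function decodeBasicEntities(text) {
  return text
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'")
    .replace(/&copy;/g, "\u00A9")
    // &amp; goes last so freshly decoded ampersands are not decoded again.
    .replace(/&amp;/g, "&");
}

console.log(decodeBasicEntities("didn&apos;t"));
console.log(decodeBasicEntities("Fish &amp; Chips &copy; 2009"));
```

Keeping it as a plain function avoids adding methods to a built-in prototype, which can clash with other scripts on the page.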
Entry filed under: JavaScript. Tags: convert, decode, entity, html, xml.
1. Andrew Zaborowski | September 28, 2009 at 11:44 pm
Note that there are a couple of thousand more of these entities defined in the standards and they’re often used in non-English HTML. I wouldn’t want to have a full list of them in the JavaScript in my page. (They’re called XML entities, not HTML entities.)
Also, a more efficient implementation would look for any & in the CDATA part of the document (i.e. not between a < and a >) and then take the part between the & and ; (e.g. “apos”) and look it up in a table. A lookup in a table implemented as an Object is extremely cheap.
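The table lookup described above could be sketched roughly as follows. This is only an illustration: ENTITY_TABLE and decodeEntities are hypothetical names, the table lists just the entities from the post, and the sketch assumes the input is text content that has already been pulled out of the markup, so it makes no attempt to skip anything between < and >.

```javascript
// Hypothetical lookup table; extend it with whatever entities your data uses.
var ENTITY_TABLE = {
  amp: "&",
  lt: "<",
  gt: ">",
  quot: '"',
  apos: "'",
  copy: "\u00A9",
  reg: "\u00AE",
  laquo: "\u00AB",
  raquo: "\u00BB"
};

function decodeEntities(text) {
  // Single pass: capture the name between & and ; and look it up.
  return text.replace(/&([A-Za-z][A-Za-z0-9]*);/g, function (match, name) {
    // Names that are not in the table are left exactly as they were.
    return Object.prototype.hasOwnProperty.call(ENTITY_TABLE, name)
      ? ENTITY_TABLE[name]
      : match;
  });
}

console.log(decodeEntities("&laquo;didn&apos;t&raquo;"));
```

Because the string is scanned once, a decoded & is never examined again, so the order of the table entries does not matter.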